1 |
Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP
In: https://hal.inria.fr/hal-03540069 (2022)
BASE
2 |
SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection ...
3 |
SIGTYP 2020 Shared Task: Prediction of Typological Features ...
Bjerva, Johannes; Salesky, Elizabeth; Mielke, Sabrina J.; Chaudhary, Aditi; Celano, Giuseppe G. A.; Ponti, Edoardo M.; Vylomova, Ekaterina; Cotterell, Ryan; Augenstein, Isabelle. - : arXiv, 2020
Abstract:
Typological knowledge bases (KBs) such as WALS (Dryer and Haspelmath, 2013) contain information about linguistic properties of the world's languages. They have been shown to be useful for downstream applications, including cross-lingual transfer learning and linguistic probing. A major drawback hampering broader adoption of typological KBs is that they are sparsely populated, in the sense that most languages only have annotations for some features, and skewed, in that few features have wide coverage. As typological features often correlate with one another, it is possible to predict them and thus automatically populate typological KBs, which is also the focus of this shared task. Overall, the task attracted 8 submissions from 5 teams, out of which the most successful methods make use of such feature correlations. However, our error analysis reveals that even the strongest submitted systems struggle with predicting feature values for languages where few features are known. ... : SigTyp 2020 Shared Task Description Paper @ EMNLP 2020 ...
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences
URL: https://arxiv.org/abs/2010.08246 https://dx.doi.org/10.48550/arxiv.2010.08246
4 |
It’s Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information ...
5 |
Linguistic calibration through metacognition: aligning dialogue agent responses with expected correctness ...
6 |
Processing South Asian Languages Written in the Latin Script: the Dakshina Dataset ...
7 |
It’s Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020)
8 |
UniMorph 3.0: Universal Morphology
In: Proceedings of the 12th Language Resources and Evaluation Conference (2020)
10 |
The SIGMORPHON 2019 Shared Task: Morphological Analysis in Context and Cross-Lingual Transfer for Inflection ...
11 |
Spell Once, Summon Anywhere: A Two-Level Open-Vocabulary Language Model ...
13 |
Unsupervised Disambiguation of Syncretism in Inflected Lexicons ...